class: center, middle, inverse, title-slide

# Module 6 - Presentation and Visualization
## EDSP Challenge Mentoring
### Marck Vaisman

---

<style type="text/css">
/* This section removes all the external social media links from the share bar
   created by xaringanExtra::use_share_again() */
.shareagain-bar {
  --shareagain-foreground: rgb(255, 255, 255);
  --shareagain-background: rgba(0, 0, 0, 0.5);
  --shareagain-twitter: none;
  --shareagain-facebook: none;
  --shareagain-linkedin: none;
  --shareagain-pinterest: none;
  --shareagain-pocket: none;
  --shareagain-reddit: none;
}
</style>

## Contents

.pull-left[
1. The difference between designing for yourself vs. designing for an audience
1. Choosing the right visualization
    * All about asking questions
    * Decomposing your chart
    * Understanding encodings
1. Visualization critique discussion
1. Final conceptual and design considerations
    * Making readable graphics
    * Deep dive into Tufte's principles
    * Bringing it all together
]

.pull-right[
### Literate programming

* https://en.wikipedia.org/wiki/Literate_programming
* http://literateprogramming.com/index.html
* https://jupyter.org/
* https://css-skills.uchicago.edu/posts/2021-11-16-literate-programming-with-r-markdown/
]

---
class: inverse, center, middle

# Let's begin

---

# Data visualization is both _art_ and _science_.
.pull-left[
**Art**

<img src="https://media.giphy.com/media/gVJKzDaWKSETu/giphy.gif">
]

.pull-right[
**Science**

<img src="https://media.giphy.com/media/VIo556t5920j07cCR4/giphy.gif">
]

---

# Data <----> grammar

.pull-left[
<img src="https://media.giphy.com/media/l2Je66zG6mAAZxgqI/giphy.gif">
]

.pull-right[
* Data visualization is as valuable to anyone working with data as grammar is to anyone working with words
* Just as you should not write an essay without proper grammar, you should not create a graph without first mastering data visualization best practices
]

---

# Visualization is an iterative process

<div align="center">
<img src="https://flowingdata.com/wp-content/uploads/2016/09/visualization-process-720x561.jpeg">
</div>

---

## How _thought_ leadership helps drive data science impact

.pull-left[
* Data **alone** is not **insight**.
* Humans can tell a better story than data can by itself.
* Numbers rarely speak for themselves; they need context.
* Data scientists with storytelling skills have greater business impact.
* Influencing for impact often comes down to conveying a compelling narrative around data and what it means.
* Data scientists who develop this skill typically have the edge in getting their work noticed and acted upon, and are more readily acknowledged as experts by peers and leaders.
]

.pull-right[
In your work as data scientists, in addition to doing modeling and machine learning work, you will be responsible (either individually or as part of a team) for providing the following as part of a project:

* **Findings:** what does the data say?
* **Conclusions:** what is your interpretation of the _findings_?
* **Recommendations:** what can be done as a result of the _findings_ and _conclusions_?
]

.footnote[Source: [MLADS Talk](https://msit.microsoftstream.com/video/968b0840-98dc-8491-7401-f1ec431e3f3f?channelId=8fc90840-98dc-8491-ce0d-f1ec433939d7)]

---

# Practice, practice, practice

<div align="center">
<img src="https://media.giphy.com/media/Y2c1ZjXVHRdSr1YojF/giphy.gif">
</div>

---
class: inverse, middle, center

# Why we visualize and a _very_ short history of dataviz

---

# Visualization takes advantage of our human capability to understand visual patterns **quickly** and often **intuitively**

<div align="center">
<img src="img/visualization.png">
</div>

---
class: inverse, center, middle

# Reasons for visualizing data

---

# Spotting trends

<div align="center">
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/2/27/Snow-cholera-map-1.jpg/1024px-Snow-cholera-map-1.jpg" width=450>
</div>

.footnote[John Snow cholera clusters in London, 1854]

---

# Spotting trends

<div align="center">
<img src="https://upload.wikimedia.org/wikipedia/commons/1/17/Nightingale-mortality.jpg" width=600>
</div>

.footnote[Florence Nightingale, Diagram of the causes of mortality in the army in the East]

---

# Analyzing and exploring

They Rule is a website that allows you to create maps of the interlocking directories of the top 100 companies in the US in 2001.

<div align="center">
<img src="img/theyrule.png" width=600>
</div>

---

# Telling a story

<div align="center">
<iframe width="766" height="431" src="https://www.youtube.com/embed/jbkSRLYSojo" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>

---
background-image: url("data:image/png;base64,#https://media.giphy.com/media/3o85gdhlpxVz8TjsTC/giphy.gif")
background-size: cover

---
class: inverse, center, middle

# Classical data summaries can lie!
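
---

# Checking the summaries yourself

A minimal sketch in base R of the point above, assuming the `anscombe` data frame that ships with R's built-in `datasets` package. The helper name `summarise_pair()` is made up for this illustration. The four rows of summary statistics should come out nearly identical, even though the four datasets look nothing alike when plotted.

```r
# Anscombe's quartet ships with base R as the `anscombe` data frame
# (columns x1..x4 and y1..y4).
summarise_pair <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y),
    var_x  = var(x),  var_y  = var(y),
    cor_xy = cor(x, y))
}

# One row per dataset, rounded to two decimals for display
quartet_summary <- round(t(sapply(1:4, summarise_pair)), 2)
rownames(quartet_summary) <- paste("dataset", 1:4)
print(quartet_summary)
```

Plotting each pair, for example with `plot(anscombe$x1, anscombe$y1)`, is what reveals the differences that this table hides.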
---

# Four datasets with identical properties

<div align="center">
<img src="img/anscombe-summary.png">
</div>

---

# Anscombe's Quartet

<div align="center">
<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Anscombe%27s_quartet_3.svg/1280px-Anscombe%27s_quartet_3.svg.png" width=500>
</div>

---

# Thirteen datasets with identical properties

<div align="center">
<img src="img/dino-summary.png">
</div>

---

# Datasaurus

<div align="center">
<img src="img/dino-all.png" height=450 width=750>
</div>

.footnote[
Matejka & Fitzmaurice, 2017
]

---

# Making the Datasaurus

<div align="center">
<img src="https://d2f99xq7vri1nk.cloudfront.net/AllDinosAnimatedSmaller.gif">
</div>

.footnote[
Matejka & Fitzmaurice, 2017
]

---

# Summaries don't differentiate

<div align="center">
<img src="https://d2f99xq7vri1nk.cloudfront.net/BoxViolinSmaller.gif">
</div>

.footnote[
Matejka & Fitzmaurice, 2017
]

---
class: inverse, middle, center

# A _very_ short history of data visualization

---

- Astronomical data presentation for navigation
- 1600s: René Descartes
- 1800-1900: Graphs and pie charts (William Playfair)
- 1913: Iowa State University introduced some of the first courses in “graphing” data
- 1977: Princeton University statistics professor John Tukey developed exploratory data analysis (EDA) using visualizations
- 1983: Edward Tufte published “The Visual Display of Quantitative Information”, which showed effective visualization methods
- 1984: Apple Computer introduced the first popular and affordable computer that focused on graphics (GUI) as a mode of interaction and display. This was huge and persists today.
- 1999: The term “information visualization” was first named in the book “Readings in Information Visualization: Using Vision to Think” by Card, Mackinlay, and Shneiderman
- Around 2000: In Few’s opinion, the IBM PC detracted from the value of making graphs by hand, as it offered a mouse- and application-based option.
  When people made graphs by hand, they took the time to be responsible.

---

# Evolution of computing visualization tools

- Hand drawn
- 1970s: CAD/CAM
- 1980s: Scientific visualization, business visualizations (Harvard Graphics)
- 1990s: Excel, PowerPoint, R
- 2000s: Open source, interactive, web

---
class: inverse, center, middle

# Back to our original programming...

---

# Think about when you first get a dataset... what might happen?

--

- You don't know what to expect when you first open up a data file

--

- You make, remake and review summary charts to get a sense of what you are dealing with

--

- You throw the dataset into some kind of automatic tool for a quick overview

--

- You see something odd or interesting and you poke some more in that area

---
class: inverse, center, middle

# a-ha!

---
class: center,middle

.ba.bw2.br3.bg-washed-blue.b--blue.ph4[
.f3[
Exploratory data analysis is detective work—numerical detective work—or counting detective work—or graphical detective work.
]

.tr[
John Tukey, _Exploratory Data Analysis_, 1977
]
]

---
class: inverse

.f2.b--dark-blue.ba.bw2.br3.ph4.mt5[
You kind of know what you’re looking for, but you don’t know what you’re going to find yet. You work with your bag of tools through the available resources.
]

--

# You then present to your team...

---
background-image: url("data:image/png;base64,#https://i.insider.com/51cb25fa69beddcd4f000005?width=700&format=jpeg&auto=webp")
background-size: cover

---
class: middle, center

<div align="center" class="tenor-gif-embed" data-postid="13045781" data-share-method="host" data-width="70%" data-aspect-ratio="1.447674418604651"><a href="https://tenor.com/view/fml-sylvester-cat-annoyed-head-bang-gif-13045781">Fml Sylvester GIF</a> from <a href="https://tenor.com/search/fml-gifs">Fml GIFs</a></div><script type="text/javascript" async src="https://tenor.com/embed.js"></script>

---

# Oh no!
--

- Your labels are not aligned

--

- You have to tilt your head to see the data in the right way

--

- The chart type you have chosen is not ideal

--

- Your color choice is not color-blind friendly (and your customer is color blind)

--

- You used the Comic Sans font

--

- A variable is named _%$%#@!!_

--

- Your chart doesn't show properly on a mobile device and the CEO is looking at it from the beach

---
background-image: url("data:image/png;base64,#https://media.giphy.com/media/G5X63GrrLjjVK/giphy.gif")
background-size: contain
class: inverse, center, bottom

# Who cares, right?

---

# Two broad things you **MUST** care about:

.pull-left[
<img src="https://media.giphy.com/media/l2YWAC4zYseVWIh8I/giphy.gif">
]

.pull-right[
* How your chart looks for your own use
* How your chart will work for your audience
]

---
class: inverse, center, middle

# Adjusting for differences between visualization for analysis and for an audience

---
class: middle

.f3.ba.bw2.br3.ph4.mt4.b--dark-blue[
A reader who lands on your chart (and the underlying data) may not have the same luxury of developing and answering questions like you did. Some might know little about data or making sense of it. Some might know more, but they don’t want to analyze the dataset. They want to know the results.
]

---

## With this in mind, let’s look over the main differences between you making charts for people to consume versus you using charts for analysis.

.pull-left[
**Visualization for analysis**

* tool for understanding datasets
* you ask questions and quickly answer them
* iterate to develop insights
]

.pull-right[
**Visualization for presentation**

* designed to communicate something useful
* can be a form of entertainment
]

---

# Four ways to adjust for these differences:

1. Explain the encodings
1. Provide context
1. Focus on readability
1. Develop aesthetics

---

# Explain the encodings

.pull-left[
<div>
<img src="https://flowingdata.com/wp-content/uploads/2020/03/Rank-of-populous-cities-1090x754.png">
</div>
]

.pull-right[
<div>
<img src="https://flowingdata.com/wp-content/uploads/2020/03/annotation-750x484.png">
</div>
]

.footnote[The Statistical Atlas of the United States, produced in the late 1800s]

???

When you don’t know if people can read your chart, explain all of the encodings. What scale are you using? What does that color represent? Is this normal?

It’s better to err on the side of too much explanation than too little. At least with the former, people can gloss over the details if they’re already familiar. They can still read the chart. With the latter, people who are unfamiliar with the visual encodings will get stuck.

Rewind a couple of centuries to when charts were less common. The Statistical Atlas of the United States, produced in the late 1800s by the Census Bureau, explained all of the encodings. For example, look at this bump chart from the 1880 atlas. It ranks cities by population:

---

# Provide context

When readers can decode the shapes, colors and geometries on your chart, you are more than halfway to producing an awesome chart. However, **readers also need to understand the context of the data.**

<div align="center">
<img src="https://flowingdata.com/wp-content/uploads/2019/03/Texting-history-750x629.png" width=500>
</div>

???

Look at the comments

---

# Another context example

<div align="center">
<img src="https://flowingdata.com/wp-content/uploads/2017/10/Heart-rate-during-layoff-720x425.png" width=720>
</div>

The chart itself is not novel or unique, but the annotations make it relevant and contextual.

???

Drawing the lines to show milestones: heard about layoff, met with HR, left work, etc.

---

# Improve readability

Charts should read like text. At the most basic level, it should be obvious what the chart is about and how to interpret it.

<div align="center">
<img src="https://flowingdata.com/wp-content/uploads/2020/07/flat-750x525.png" width=600>
</div>

???

Cluttered

Think about flipping through the pages of a book. There are chapters. There are paragraphs. There are breaks in between large bodies of text. Sometimes there are pictures to illustrate ideas. These fit within the page and tend to flow from left to right and top to bottom. Strip that organization away, and you have a bunch of words on a page with no apparent beginning or end.

With visualization for an audience, you want to provide readability through design. A little bit of alignment and organization can go a long way. For example, here’s a chart with annotation layered on top of the visual encodings. It shows median assets and debt by age:

---

# Improve readability

<div align="center">
<img src="https://flowingdata.com/wp-content/uploads/2020/07/organized-750x525.png">
</div>

---

# Develop aesthetics

.pull-left[
* Default settings in the tools are generic, designed so that they work with many datasets and visualization types
* You can (and should) develop _aesthetics_ (your own visual style) to make your charts less ugly
]

.pull-right[
<img src="https://flowingdata.com/wp-content/uploads/2013/05/Tufte-example.gif" width=300>
]

---

# Develop aesthetics (continued)

<div align="center">
<img src="https://flowingdata.com/wp-content/uploads/2015/10/Feltron-Report-1090x681.jpg" width=600>
</div>

???

Nicholas Felton: data about himself

---

# Using these guidelines

* They’re more continuous than binary. Your charts might need more or less explanation, more or less context, etc.
* It depends on your audience and the purpose behind your chart. If your audience is a small group who has the same background as you, then you might not need to provide as much context for the data you show. If your audience is already excited about a dataset, then you probably don’t need to make it too flashy.
  If you make charts for a research paper, there are probably publisher guidelines that you need to follow, which limit what you can do (sometimes a good thing).
* Think of the above adjustments as _continuous knobs_ that you can turn up or down. The more charts you make, the better you’ll get at deciding how much to turn.

---
class: inverse, center, middle

# Basic design rules for making charts

<img src="https://flowingdata.com/wp-content/uploads/2010/07/0-basic-graph1.jpg">

---

# Two leading figures

.pull-left[
Ed Tufte

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Edward_Tufte_-_cropped.jpg/800px-Edward_Tufte_-_cropped.jpg" width=300>
]

.pull-right[
Nathan Yau

<img src="https://datastori.es/wp-content/uploads/2018/09/Nathan-Yau-nathan-yau.jpg" width=300>
]

---
class: middle, center, inverse

> Design is choice. The theory of the visual display of quantitative information consists of principles that generate design options and that guide choices among options. The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper. Most principles of design should be greeted with some skepticism, for word authority can dominate our vision, and we may come to see only through the lenses of word authority rather than with our own eyes.
>
> --- Edward Tufte, <cite>The Visual Display of Quantitative Information</cite>

---

## Tufte's Principles of Graphical Integrity

1. Show data variation, not design variation
1. Do not use graphics to quote data out of context
1. Use clear, detailed, thorough labelling
1. Representation of numbers should be directly proportional to numerical quantities
1. Don't use more dimensions than the data require

---

## Tufte's Principles of Graphical Integrity

1. Show data variation, not design variation
    - Don't get fancy, let the data speak
1. Do not use graphics to quote data out of context
    - Maintain accuracy
1. Use clear, detailed, thorough labelling
    - Use annotations to make your point
1. Representation of numbers should be directly proportional to numerical quantities
    - This is essential for fair representation
1. Don't use more dimensions than the data require
    - Be appropriate in use of 3D graphics, for example

---

## Tufte's Fundamental Principles of Design

1. Show comparisons
1. Show causality
1. Use multivariate data
1. Completely integrate modes (like text, images, numbers)
1. Establish credibility
1. Focus on content

---

# Nathan Yau's Seven Basic Rules for Making Charts and Graphs

1. Check the data
1. Explain encodings
1. Label axes
1. Include units
1. Keep your geometry in check
1. Include your sources
1. Consider your audience

<div style='text-align: right'>
Nathan Yau, Flowing Data<br>
<a href="https://flowingdata.com/2010/07/22/7-basic-rules-for-making-charts-and-graphs/">https://flowingdata.com/2010/07/22/7-basic-rules-for-making-charts-and-graphs/</a>
</div>

---

.pull-left[
### 1) Check the data

<img src="https://flowingdata.com/wp-content/uploads/2010/07/1-check-the-data.jpg">

* This should be obvious
* If your data is weak, your chart is weak
* Start with simple graphs to see if there are any outliers
]

--

.pull-right[
### 2) Explain encodings

<img src="https://flowingdata.com/wp-content/uploads/2010/07/2-explain-encodings.jpg">

* Don't assume the reader knows what everything means
* Provide a legend
* Label shapes
* Explain color scales
]

---

.pull-left[
### 3) Label axes

<img src="https://flowingdata.com/wp-content/uploads/2010/07/3-labels-axes.jpg">

* Axes without labels or explanation are just decoration
* Describe the scale (incremental, exponential, logarithmic?)
* Have axis values start at zero
]

--

.pull-right[
### 4) Include units

<img src="https://flowingdata.com/wp-content/uploads/2010/07/4-include-units.jpg">

* Numbers without units are meaningless
* Remove the guesswork
]

---

.pull-left[
### 5) Keep your geometry in check

<img src="https://flowingdata.com/wp-content/uploads/2010/07/5-keep-geometry-in-check.jpg">

* This is something that is immediately noticeable
* Don't use area to compare two quantities unless they are areas. Scaling a shape's side by a factor scales its area by the square of that factor.
* Tip: size circles and other 2D shapes by area, unless it's a bar chart
]

--

.pull-right[
### 6) Include your sources

<img src="https://flowingdata.com/wp-content/uploads/2010/07/6-sources.jpg">

* This is another obvious one
* Always include the source of your data
* Makes your graphic more reputable
* Allows others to dig deeper
]

---

.pull-left[
### 7) Consider your audience

<img src="https://flowingdata.com/wp-content/uploads/2010/07/7-audience.jpg">

* What purpose do your charts have and who are they for?
* Avoid quirky fonts
* Make good design choices
]

---
class: inverse, middle, center

# Choosing the right chart type

---

## Why do we visualize data?

.fl.w-60.b--solid.ph4[
**Record** information

- Blueprints, photographs, seismographs, ...

**Analyze** data to support reasoning

- Develop and assess hypotheses
- Find patterns and discover anomalies in data

**Communicate** information to others

- Share and persuade
- Collaborate and revise
]

---

## What is the best way to visualize your data?

--

* What do you want to show?
    - What do you want to emphasize?

--

* Why do you want to show it?
    - What is the message you want to convey?

--

* Who are you showing it to?
    - Understand what your audience will be receptive to
    - What is _their_ context?

---

## Is choosing the right visualization a straightforward choice?
--

.pull-left[
### Smaller datasets

* Look at the data
* Use multiple looks to understand the data
* Choose which patterns you want to visualize
]

--

.pull-right[
### Larger datasets

* Use random sampling to look at smaller sub-samples
* Experiment
* Methods are advancing to enable big data visualization (later this semester)
]

.fl.w-70[]
.fl.w-30[<img src="img/big_data.png">]

---

## The chart selection process is _not_ mechanical

### Just as you can't

* randomly place a bunch of words together to make a book
* randomly record videos and get a finished film out of them
* randomly grab ingredients from the pantry, toss them in the pan and expect a great meal...

--

### You cannot just put a chart together as a sequence of steps.

--

### However, there is still a method and a mental model

---

## Ask and answer questions

.pull-left[
* There are many different ways to express a story from data
    - Blind men and the elephant (different perspectives)
    - Changing vantage points (different views)
    - You can change your vantage point and how you want to see the data
    - Nathan Yau shows [25 ways to visualize one dataset](https://flowingdata.com/2017/01/24/one-dataset-visualized-25-ways/)
* Meaningful analysis requires
    - context,
    - background, and
    - a human in the loop
* Different questions can lead to different chart types and focus
]

.pull-right[
<img src="https://flowingdata.com/wp-content/uploads/2019/04/Better-questions-for-data-analysis-750x804.png" width=400>
]

---
class: inverse, center, middle

# Choosing your data format

---

## Recipes for selecting the right chart

[From Data to Viz by Yan Holtz and Conor Healy](https://www.data-to-viz.com/)

[The Data Viz Project by ferdio](https://datavizproject.com/)

[Multiple views on how to choose a visualization by Steven Franconeri](http://experception.net/Franconeri_ExperCeptionDotNet_DataVisQuickRef.pdf)

[Slide Chooser by Andrew Abela](https://extremepresentation.typepad.com/blog/2015/01/announcing-the-slide-chooser.html)

[The Graphic
Continuum by Jon Schwabish and Severino Ribecca](https://policyviz.com/2014/09/09/graphic-continuum/)

----

However, this _isn't an "if this, then that" scenario_

There can be multiple views that show different aspects of the data

All can be useful, and equally "correct"

--

#### The real question is: does the visualization convey your story in a way that is accurate and that your audience can receive, digest and understand?

---
class: middle, center, inverse

# Creating a chart by splitting it into components

---

## No chart is made completely in a single pass

* A chart is not a single **monolithic** element, so don't think of it as one
* Thinking of a chart as a single element may work for standard charts such as bar charts, line charts and scatterplots, because most software tools provide quick ways of creating them with _reasonable_ defaults
* What do you do when even a basic chart or a single element is off?

--

### You split the chart into components

--

- The basic mental model is that charts are **compositional**
- There are building blocks and ways to put them together
- If you understand the relevant parts, you can **compose** charts by mixing and matching and layering and joining

--

.f1.red.center[This is a very powerful model]

---

## _Plane_ and _retinal variables_

A **plane** is like the coordinate system that defines how geometries are placed in a space.

A **retinal variable** defines how to encode data into visuals.

.center[<img src="https://flowingdata.com/wp-content/uploads/2018/09/Bertin-components-750x670.png" width=400>]

.small[Jacques Bertin, _Semiology of Graphics_, 1967]

---

## The Grammar of Graphics

William S. Cleveland, in his 1994 book _The Elements of Graphing Data_, lists the “basic elements of graph construction” as **scales, captions, plotting symbols, reference lines, keys, labels, panels, and tick marks**.
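
As a rough illustration (not taken from Cleveland's book), most of these elements correspond to an argument or function in base R graphics. The sketch below uses the built-in `mtcars` dataset; the choice of variables and the element labels in the comments are assumptions made for this example.

```r
# Draw to a null PDF device so the sketch runs without opening a window
# or writing a file; drop this line (and dev.off()) to see the plot.
pdf(NULL)

par(mfrow = c(1, 2))                        # panels: two plots side by side

for (gears in c(3, 4)) {
  cars <- mtcars[mtcars$gear == gears, ]
  plot(cars$wt, cars$mpg,
       log  = "y",                          # scale: logarithmic y axis
       pch  = ifelse(cars$am == 1, 17, 1),  # plotting symbols by transmission
       xlab = "Weight (1000 lbs)",          # labels
       ylab = "Miles per gallon (log scale)",
       main = paste(gears, "forward gears"),
       xaxt = "n")                          # suppress the default x ticks
  axis(1, at = 1:6)                         # tick marks at whole numbers
  abline(h = median(mtcars$mpg), lty = 2)   # reference line: overall median
  legend("topright", pch = c(17, 1),        # key
         legend = c("manual", "automatic"))
}

# caption along the bottom of the figure
mtext("Data: mtcars (Motor Trend, 1974)", side = 1, line = -1.5,
      outer = TRUE, cex = 0.8)

panels_drawn <- prod(par("mfrow"))          # number of panels in the layout
dev.off()
```

Each element is added by a separate call or argument, which is the compositional idea that the rest of this section builds on.
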

In _The Grammar of Graphics_, published in 2005, Leland Wilkinson built off the work by Bertin and more formally defined the components of a graphic:

Statistical graphic specifications are expressed in six statements:

| Statement | Description |
| --------- | ----------- |
| DATA | a set of data operations that create variables from datasets |
| TRANS | variable transformation (e.g. rank) |
| SCALE | scale transformations (e.g. log) |
| COORD | a coordinate system (e.g. polar) |
| ELEMENT | graphs (e.g. points) and their aesthetic attributes (e.g. color) |
| GUIDE | one or more guides (axes, legends, etc.) |

**Hadley Wickham implemented Wilkinson’s grammar in R with the popular `ggplot2` package.**

---

## Strategies for breaking charts into individual components

* The **data** drives all decisions
    - The purpose is to convey the **information** in the data
* The **visual encodings** dictate the geometry and/or colors of a graphic
    - This forms the **aesthetics** of the visualization
    - This most influences how the visualization is received
* The **coordinate system** (Cartesian, polar, or geographic) specifies the space in which the visual encodings reside.
    - This provides the canvas, scales and orientations upon which we visualize
* The **context** communicates what the data is about, where it is from, and why it exists.
    - This can be provided through textual annotations, legends, etc.

---

## Example 1: breaking up a chart

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/get-together-single-bar-750x741.png" width=450>]

.small[A study on how people first met: [https://data.stanford.edu/hcmst2017](https://data.stanford.edu/hcmst2017)]

---

## The _data_ is aggregated percentages

```r
                waymet         n           p
18       business_trip  7.260476 0.002214773
17 single_serve_nonint 27.895815 0.008509483
19      work_neighbors 35.471720 0.010820476
16            vacation 40.959558 0.012494514
8                  mil 65.504234 0.019981748
15          blind_date 93.696132 0.028581549
...
```

---

## The bars are the _visual encoding_

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/get-together-single-bar-encoding-750x741.png" width=450>]

.small[Length represents a percentage]

---

## The _coordinate system_ is Cartesian

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/get-together-single-bar-scale-750x741.png" width=450>]

.small[A _linear_ scale on the horizontal and a _categorical_ scale on the vertical axis]

---

## Additional information provides _context_

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/get-together-single-bar-context-750x741.png" width=450>]

.small[Titles, labels, markings, etc.]

---

## Example 2

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/spending_main_categories-1090x906.png" width=500>]

.small[A _mosaic plot_ that shows average spending by income group in the United States]

---

## The _data_ is the average dollar amount for each income group

```r
item                 hierarchy_level    all  Less than $15,000  $15,000 to $29,999  $30,000 to $39,999  $40,000 to $49,999  $50,000 to $69,999  $70,000 to $99,999  $100,000 to $149,999  $150,000 to $199,999  $200,000 and more
Food and Beverage                  1   7216               3771                4453                5226                6040                6744                8453                 10362                 13571              16105
Food at Home                       2   4049               2450                2904                3064                3656                3893                4772                  5554                  6718               7135
Food Away From Home                2   3154               1318                1533                2157                2371                2847                3664                  4797                  6832               8919
Alcoholic Beverages                2    484                133                 215                 280                 320                 420                 596                   734                  1169               1659
Housing                            1  18886               9698               12268               14533               15575               17331               20564                 26003                 33319              46076
...
```

---

## What are the _visual encodings_?

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/spending_main_categories-encodings-750x623.png" width=450>]

--

.small[Height: percentage, width: average total spending for each income group, color: spending category]

---

## What is the coordinate system?
.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/spending_main_categories-coordinates-750x623.png" width=450>]

--

.small[Cartesian: x-axis is total dollars spent, y-axis is 0-100% for the group]

---

## What is the _context_?

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/spending_main_categories-context-750x623.png" width=450>]

---

## Let's decompose this one

.pull-left[
<img src="https://flowingdata.com/wp-content/uploads/2018/07/How-America-uses-its-land-750x496.png">
]

.pull-right[
**Discuss**

* Data
* Visual encodings
* Coordinate system
* Context
]

---
class: inverse, middle, center

# Visual Encodings

---

## Visual encodings can be categorized into the main groups below

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/visual-encodings-1-1090x478.png" width=700>]

.small[All visualizations use some combination of these]

---

## Example: a _scatterplot_ uses position on an x-y scale

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/scatter-plot-guide-750x443.png">]

---

## A _bar chart_ uses length to show values

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/bar-chart-type-750x360.png" align="center">]

---

## Some encodings work better than others

.pull-left[
<img src="https://flowingdata.com/wp-content/uploads/2010/03/errors.png">
]

.pull-right[
**Decoding Error**

1. Position along a common scale
1. Length
1. Angle and slope
1. Area
1. Volume, density, and color saturation
1. Color hue
]

---

.pull-left[
## Position

_Position_ allows you to compare values based on where they are placed with reference to a coordinate system.

### Considerations

* Be aware of the scales you are using (linear vs logarithmic)
    - The scale changes the interpretation of distance
    - It can also change the perceived patterns
]

.pull-right[
<img src="data:image/png;base64,#module6_files/figure-html/unnamed-chunk-15-1.png" width="2400" style="display: block; margin: auto;" /><img src="data:image/png;base64,#module6_files/figure-html/unnamed-chunk-15-2.png" width="2400" style="display: block; margin: auto;" />
]

---

.pull-left[
## Position

_Position_ allows you to compare values based on where they are placed with reference to a coordinate system.

### Considerations

* Avoid overplotting, since many points can occupy the same space and obscure one another

### Solutions

- **Use transparency** so that overlapping points make darker areas
- **Use jitter** (add noise so points are no longer on top of each other)
- **Use binning** to show aggregate data per pixel
]

.pull-right[
<img src="data:image/png;base64,#module6_files/figure-html/unnamed-chunk-16-1.png" width="2400" style="display: block; margin: auto;" /><img src="data:image/png;base64,#module6_files/figure-html/unnamed-chunk-16-2.png" width="2400" style="display: block; margin: auto;" />
]

---

## Length

_Length_ is most commonly used in the context of bar charts. The longer a bar is, the greater the value.

**Don't truncate bar charts; use length in its entirety!**

--

.pull-left[
<img src="https://flowingdata.com/wp-content/uploads/2012/08/Bush-cuts.png">
]

--

.pull-right[
<img src="https://flowingdata.com/wp-content/uploads/2012/08/Fox-chart-corrected.png">
]

.small[**BAD** FOX news, again...]

---

## Angle

_Angles_ range from 0 to 360 degrees in a circle.

.pull-left[
### Considerations

* Angles are most associated with _pie charts_. A pie chart is made up of parts that make up a whole.
* Don't use too many categories (a bar chart is better)
* **The sum of all percentages should equal 100%!**
]

--

.pull-right[
<img src="https://flowingdata.com/wp-content/uploads/2009/11/Fox-News-pie-chart.png" width=500>
]

---

## Don't even think about this!

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/2b-angles-unclear-750x521.jpg">]

---

## Slope

_Slope_ is similar to _angle_. Line charts are the most common use of slope to encode data.

.pull-left[
### Considerations

* Slope magnitude: steeper = greater change, flatter = lesser change
* The aspect ratio
* Visual change should match the context of the change

**Cleveland, McGill & McGill (1988)** suggested that the average slope in a line chart should be `\(45^\circ\)`, in order to make neutral comparisons between lines.

This is still a good rule of thumb.
]

.pull-right[
<img src="https://flowingdata.com/wp-content/uploads/2020/03/slope-scope-750x248.png" align="center">

<img src="data:image/png;base64,#module6_files/figure-html/unnamed-chunk-17-1.png" width="2400" style="display: block; margin: auto;" />
]

---

## Area

Like _length_, _area_ can be used to represent data with size, but with two dimensions instead of one.

.pull-left[
### Considerations

* While the encoding might not be as precise from a visual perception perspective, area can provide a more intuitive, less abstract view for some types of data
* Make sure you scale by area, not edge (remember, area grows with the square of the side length)
    - This means you should encode the length of a side as `\(\sqrt{x}\)`
]

.pull-right[
<img src="https://flowingdata.com/wp-content/uploads/2020/03/area-incorrect-750x427.png">
]

---

## A _treemap_ uses rectangle areas to show hierarchical data

.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/spending-treemap-750x574.png" width=700>]

.footnote[taken from **Flowing Data**]

---

## Volume

_Volume_ can be used in the same way as _area_ but has one more dimension.
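
A small sketch in base R of the square-root rule from the Area slide and its cube-root counterpart for volume. The function names `side_for_area()` and `edge_for_volume()` are hypothetical, chosen only for this example.

```r
# Side length (or radius) to draw so that AREA is proportional to the value
side_for_area <- function(x) sqrt(x)

# Edge length to draw so that VOLUME is proportional to the value
edge_for_volume <- function(x) x^(1 / 3)

values <- c(1, 4, 9)

sides <- side_for_area(values)   # correct: side lengths grow with sqrt(x)
sides^2                          # squaring the sides recovers the data values

# The common mistake: using the value itself as the side length.
# The drawn areas are then values^2, which exaggerates the differences.
wrong_areas <- values^2

edges <- edge_for_volume(c(1, 8, 27))
edges^3                          # cubing the edges recovers the data values
```

With the wrong scaling, the largest value here is drawn 81 times larger than the smallest instead of 9 times larger.
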
.pull-left[ ### Considerations * Make sure you scale by volume, not edge (remember, volume gets cubed per unit increase) - This means you would encode the side of a "box" as `\(x^{1/3}\)` For 3-D encodings, you need to take the volume as proportional to the data ] .pull-right[ <img src="https://flowingdata.com/wp-content/uploads/2020/03/volume-incorrect-750x399.png"> <img src="img/volumes.png" height=200></img> ] --- ## The volume, or 3D perspective representation can make tangible data more relatable .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/copper03-750x600.jpg" width=700>] --- ## Color _Color_ as a visual encoding can be split into two categories: **hue** and **saturation**. **Hue** is what most people refer to as color (red, green, blue, etc.) **Saturation** is the amount of **hue** in a color. .left-column30[ * Qualitative: every color represents a distinct attribute (category) * Sequential: color represents a range (**saturation**) from low to high (or vice-versa) * Diverging: multiple hues represent a point of inflection of the data ] .right-column70[ <img src="img/brewer-scales.png"> ] --- ## Sequential example .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/Jeopardy-game-board-double-750x473.png" width=600>] --- ## Another sequential example .center[<img src="https://flowingdata.com/wp-content/uploads/2019/02/3-d-population-1090x419.png">] --- ## Divergent example .center[<img src="https://flowingdata.com/wp-content/uploads/2017/12/Tax-change-720x334.png">] --- background-image: url('img/colormaps.png/colormaps.png.001.jpeg') background-size: contain <div style="margin-bottom:1px; margin-left:1px; width:400px; height:150px;position:fixed;bottom:0; "> Most of these palettes are available to both <b>ggplot2</b> and <b>matplotlib</b>. 
For R, you may have to load packages like <b>RColorBrewer</b> or <b>viridis</b> </div> --- ## Consider color blindness .pull-left[<img src="https://flowingdata.com/wp-content/uploads/2020/03/colorblind-750x281.png">] .pull-right[<img src="img/colorblindness.png">] --- ## Consider printing .center[<img src="img/printing.png">] --- ## Color can provide context .center[<img src="https://flowingdata.com/wp-content/uploads/2019/12/Christmas-trees-750x542.png" width=600>] .small[Where Christmas trees come from] --- ## In summary, work with the following attributes to encode your data .pull-left[ * Position * Length * Angle * Slope * Area ] .pull-right[ * Volume * Color * Density \* * Shape \* ] Or any combination thereof! .small[\* not discussed today] --- class: inverse # Visualization critiques: what is wrong with this picture? * What is the first thing you notice about this visualization? * What point is the visualization trying to make? * Who is the intended audience? * What is the visualization doing well? * What problems do you see with the visualization design? --- ## Bad example 1 .center[<img src="img/bad1.png">] --- ## Bad example 2 .center[<img src="img/bad2.png">] --- ## Bad example 3 .center[<img src="img/bad3.png">] --- ## Bad example 4 .center[<img src="img/bad4.png">] --- ## Bad example 5 .center[<img src="img/bad5.png">] --- ## Some cool visualization links [Visual Capitalist](https://multithreaded.stitchfix.com/blog/2020/09/02/what-color-is-this/) [Scientific American: The Pulsar Chart That Became a Pop Icon Turns 50: Joy Division’s Unknown Pleasures](https://www.scientificamerican.com/article/the-pulsar-chart-that-became-a-pop-icon-turns-50-joy-division-rsquo-s-unknown-pleasures/) [Stitch Fix: What Color is This?](https://multithreaded.stitchfix.com/blog/2020/09/02/what-color-is-this/) [Reddit (yes, Reddit) r/dataisbeautiful](https://www.reddit.com/r/dataisbeautiful/) --- ## What makes a readable graphic? 
-- * It depends on who you ask -- * Many go by the _data-ink ratio_ as described by Tufte: .pull-left[ > A large share of ink on a graphic should present data-information, the ink changing as the data change. Data-ink is the non-erasable core of a graphic, the non-redundant ink arranged in response to variation in the numbers represented. ] .pull-right[ <img src="https://infovis-wiki.net/w/images/5/55/DIR.jpg"> ] -- * It depends --- ## Data is fluid and visualization represents that fluidity * The real world is complicated * There are visualization rules, related to the technical aspects of how a chart is constructed, that cannot be broken * However, there are principles and guidelines (fuzzier aspects of chart design) that you need to adapt to the data and the context: - The baseline _always_ needs to start at zero. _But what if the data has no zeros?_ - Pie charts are terrible, never use them. _But people know how to read pie charts and it's fine for this specific dataset._ - A bar chart would have been better. _Insert some snarky remark here._ --- ## Tradeoffs When visualizing for _an audience_, there are always factors to consider that can conflict with visual efficiency -- ### A _readable chart_: -- * Provides **clarity** (removes confusion) -- * Has a **clear purpose** -- * Uses **visual encodings that make sense** for the **context** of the data -- * Has a **clear direction for how to interpret** --- class: inverse, middle, center # Visual Hierarchy --- class: middle # When you make a chart using _default settings_, you usually get a flat graphic where everything (from the tick marks, to the encoded data, to the title) gets the same amount of importance visually --- ## Lines, colors, border box, etc. are on the same level of importance as the data itself. Nothing stands out.
.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/hierarchy-defaults-750x326.png">] --- ## Small adjustments can make the data more prominent and move the other parts back into a supporting role. .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/hierarchy-shades-750x326.png">] --- ## It’s more obvious what part of the chart is the actual data and what part is background context .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/hierarchy-placement-750x326.png">] --- ## Example .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/hierarchy-us-emissions-750x539.png" width=600>] .small[[Interactive on NYT's site](https://www.nytimes.com/interactive/2019/02/13/climate/cut-us-emissions-with-policies-from-other-countries.html)] --- ## Color contrast .pull-left[ * Color makes parts of a chart stand out * Brighter and bolder colors appear more prominent than greyed or faded colors * To increase the visibility of your data, make it appear _higher_ in the visual hierarchy .small[[20 Years, 20 Titles - Roger Federer](https://www.srf.ch/static/srf-data/data/2018/federer/#/en)] ] .pull-right[ <img src="https://flowingdata.com/wp-content/uploads/2018/01/Roger-Federer-the-winner-720x586.png"> ] --- ## Size .pull-left[ * Objects that use more space on the screen or paper will naturally draw more attention * Vary the sizes in your chart to bring more attention to points of interest * One obvious case is the size of text .small[[Salary and Occupation](https://flowingdata.com/2019/11/18/salary-and-occupation/)] ] .pull-right[ <img src="https://flowingdata.com/wp-content/uploads/2020/03/Salary-and-Occupation-750x822.png" height="80%" width="80%"> ] --- ## Placement .pull-left[ * Where you put your data (top, bottom, left, right) also affects visual hierarchy * Things placed at the top of a chart appear more important than things placed at the bottom.
* For example, in government and politics, _left_ and _right_ might be linked to certain ideologies .small[[Are you a Democrat or a Republican?](https://www.nytimes.com/interactive/2019/08/08/opinion/sunday/party-polarization-quiz.html)] ] .pull-right[ <img src="https://flowingdata.com/wp-content/uploads/2020/03/decision-tree-1536x1326.png"> ] ??? Turn the whole chart around by 90 degrees to place Democratic groups on the bottom and Republican groups on the top, and the placement could be taken the wrong way. In this case, the left and right placement puts the two ideologies on the same level. --- ## Highlighting .pull-left[ * Use highlighting to call out specific areas of a visualization to direct readers’ eyes to what is important .small[[Hurricane Maria](https://www.washingtonpost.com/graphics/2017/national/puerto-rico-hurricane-recovery/)] ] .pull-right[ <img src="https://flowingdata.com/wp-content/uploads/2020/03/Hurricane-Maria-1536x1120.png"> ] --- ## Layering * Think of the visual hierarchy as layers - The most important items get placed on the top of the stack - Items that are less important, or rather, more boilerplate, can fall to the back * The layering metaphor is especially helpful when you implement or design your visualization. * For example, Adobe Illustrator and Inkscape already use layers, so you can stack things on top of each other based on your goals * If you’re using code, the code for a bottom layer tends to run before the top layers. From the reader's perspective, it’s more obvious where to focus attention. They can spend less time trying to interpret the chart and the data and more time understanding your own interpretations of the data. --- ## Give this flat chart some hierarchy .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/flat-line-chart-1536x1436.png" width=500>] --- ## Much better!
.center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/social-security-card-holders.png" width=500>] > The [Golden Ratio](https://en.wikipedia.org/wiki/Golden_ratio), also known as the _divine ratio_, is a fundamental ratio in nature, and creates an aesthetically pleasing balance between dimensions. The ratio is `\(\frac{1 + \sqrt{5}}{2} \approx 1.618\)`. Often we'll see rectangles where the longer side and shorter side are in this ratio. --- class: inverse, center, middle # Providing Context --- ## Tips for Providing Context * Annotation * Tone * Direct Labeling * Font Selection * Point of Reference --- ## Annotation * Annotation is the quickest and most straightforward way to add context to your charts. However, under the false security of “letting the data speak”, oftentimes these words are missing from default charts. * Add the extra layer of information, and you draw attention to specific areas and points, help explain visual encodings, and describe what a reader is seeing. * Words can set expectations, so that readers know what they’re about to see. Here’s Hidy Kong on her group’s research on visualization titles: * Visualization titles influence how people interpret, perceive bias in, and trust data visualizations. * Sometimes it doesn’t even matter that a title contradicts the chart.
The title could say that something increased over time when the chart showed a clear decrease, and **the reader would take away the context of the title over the chart.** --- ## Tone .pull-left[ * The words you use to describe your data can change the tone of your charts, which can change how people interpret them * Using casual language could signal to readers that your chart presents a less serious topic * Using more technical language could signal that the chart is meant for a technical audience * Choose your words wisely ] .pull-right[ <img src="https://flowingdata.com/wp-content/uploads/2019/01/3-pt-midrange-line-chart-2-750x657.png"> ] .small[[Goodbye, midrange shot](https://flowingdata.com/2019/01/15/goodbye-mid-range-shot/)] --- ## Direct Labeling * Most visualization software lets you add legends to your charts to describe what each visual encoding represents * The challenge for readers is that they have to refer to the legend and **look away** from the actual chart * Try to directly label visual encodings However... * Most statistical software (R, Python, MATLAB, etc.) handles text placement and typography on graphics poorly - This situation is improving with newer packages * These programs lack control of typography, in contrast to maps, where words can be placed directly on the labelled map element. * Additional post-processing is usually done with Illustrator or some other tool --- ## Font Selection .pull-left[ There are primarily two classes of annotations: 1. labels that help readers decode the visualization (axis, tick, and category labels) 1.
annotations that explain the data, which are usually required to provide context for a specific dataset Nathan Yau uses `monospace fonts` for general labels and an _italicized serif font_ for contextual annotation ] .pull-right[<img src="https://flowingdata.com/wp-content/uploads/2020/03/different-text.png">] .small[[Reaching $100k in savings](https://flowingdata.com/2019/10/29/when-people-reach-100k-in-savings/)] --- ## Point of Reference .pull-left[ * Visualization is all about comparison * If it is difficult to compare visual encodings, then it is difficult to interpret a chart, much less get anything useful out of it * Providing a point of reference is a straightforward remedy * With time series data, it can be useful to use a specific time as a point of reference ] .pull-right[ <img src="https://flowingdata.com/wp-content/uploads/2020/03/Marrying-age-1536x846.png">] .small[[Marrying Age](https://flowingdata.com/2016/03/03/marrying-age/)] --- class: inverse, middle, center # Aesthetics --- ## Aesthetics are subjective and can provide more _clarity_ Put effort into aesthetics, and it can help readers understand your charts better and also differentiate your own style .pull-left[ **Aesthetics can provide the following:** * Beauty * Readability * Identity * Expectations ] .pull-right[ **Elements of aesthetics:** * Organization and arrangement * Sizes and weights * Color palette * Medium ] --- class: inverse, middle, center # Tufte's Principles [Tufte's Rules (requires Adobe Flash)](http://sealthreinhold.com/school/tuftes-rules/rule_five.php) --- ## The Visual Display of Quantitative Information .pull-left[<img src="img/visual-display.png" width=400>] .pull-right[This book discusses **statistical graphics**, **charts and tables**, as well as the theory behind the **design of information graphics or data graphics**.
The book goes into a **detailed analysis of successful ways in which to display complex,** statistical information with quick, easy and **effective design techniques.** The first edition was published in 1983. ] --- ## Envisioning Information .pull-left[ <img src="img/envisioning.png" width=400>] .pull-right[ This book tackles the problem of **conveying multiple variable information on a 2-d space.** It teaches us ways in which we can **communicate more information per unit to make good**, clear and smart presentations. This book won 17 awards, and was published in 1990. ] --- ## Tufte's Principles of Graphical Integrity 1. Show data variation, not design variation - Don't get fancy, let the data speak 1. Do not use graphics to quote data out of context - Maintain accuracy 1. Use clear, detailed, thorough labelling. - Use annotations to make your point 1. Representation of numbers should be directly proportional to numerical quantities - This is essential for fair representation 1. Don't use more dimensions than the data require - Be appropriate in use of 3D graphics, for example --- ## Tufte's Fundamental Principles of Design 1. Show comparisons 1. Show causality 1. Use multivariate data 1. Completely integrate modes (like text, images, numbers) 1. Establish credibility 1. Focus on content --- ## Sparklines Invention of the **sparkline**, most commonly used in stock activity. > A sparkline is a small intense, simple, word-sized graphic with typographic resolution. Sparklines mean that graphics are no longer cartoonish special occasions with captions and boxes, but rather sparkline graphic can be everywhere a word or number can be: embedded in a sentence, table, headline, map, spreadsheet, graphic. .tiny[Tufte, May 27, 2004] .center[<img src="img/sparklines.png" width=600>] --- ## Small Multiples .pull-left[ The **small multiples** method is one that Tufte uses often to portray multiple graphs of information. 
> At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives. For a wide range of problems in data presentation, small multiples are the best design solution. .tiny[Tufte, _Envisioning Information_, page 67] ] .pull-right[ <img src="img/small-multiples.png"> ] --- ## Graphical Integrity .small[ * The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented * Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data. * Graphics must not quote data out of context ] .center[<img src="img/integrity.png" width=600>] --- ## Data-Ink .pull-left[ A large share of ink on a graphic should present data-information, the ink changing as the data change. **Data-ink is the non-erasable core of a graphic**, the non-redundant ink arranged in response to variation in the numbers represented. How to maximize the data-ink ratio, within reason: 1. Erase non-data-ink, within reason 2. Erase redundant data-ink 3. Revise and edit ] .pull-right[<img src="img/dataink.png">] --- ## Chartjunk .pull-left[ * Forgo chartjunk, including moiré vibration, the grid and the duck * The interior decoration of graphics generates a lot of ink that does not tell the viewer anything new. * The purpose of decoration varies: to make the graphic appear more scientific and precise, to enliven the display, to give the designer an opportunity to exercise artistic skills.
* **Non-data-ink and redundant data-ink are often chartjunk.** ] .pull-right[ <img src="img/chartjunk.png"> ] --- ## Multifunctioning Graphical Elements .pull-left[ * **Mobilize every graphical element**, perhaps several times over, to show the data. * The graphical element that actually locates or plots the data is the data measure. * The complexity of multifunctioning elements can sometimes **turn data graphics into visual puzzles**, cryptographical mysteries **for the viewer to decode**. ] .pull-right[ <img src="img/puzzle.png" height=400> ] --- ## Escaping Flatland .small[ * Introduce **multiple dimensions on a two-space surface** * Focus more on the point than on the presentation; good design strategies are transparent. * Find **patterns** * Words may not be the most appealing to everyone but symbols are universal and understood by all * **More small images in sequence** allow more comparison with your eyes and a better understanding ] .center[<img src="img/shirts.png" width=600>] --- ## Layering and Separation Would Tufte approve of this diagram? .center[<img src="img/flags1.png">] --- ## Layering and Separation No! He would not. To make the visual depictions more effective, simplify them: * Use macro annotation, which can help explain micro detail * Use light, color and space effectively * Remove the weight and avoid vibration .center[<img src="img/flags2.png">] --- class: inverse, middle, center # Bringing it Together --- ## So far you've learned about * **designing for an audience** -- * **picking the right visualization** -- * **making readable graphics** -- ## Now what? --- ## Practice, practice, practice .center[<img src="https://media.giphy.com/media/Y2c1ZjXVHRdSr1YojF/giphy.gif">] --- ## You make awesome charts .center[<img src="https://64.media.tumblr.com/602e8a73e8230d7bb91205342ea88a33/tumblr_qemlmmsPHe1sgh0voo1_1280.png" width=600>] --- ## You are a Dataviz G.O.A.T.
(greatest of all time) .center[<img src="https://media1.giphy.com/media/rTfN2FHPPTABy/giphy.gif">] --- class: inverse ## nah! .center[<img src="https://media2.giphy.com/media/W5YVAfSttCqre/giphy.gif">] --- ## You work with _MORE_ data! .center[<img src="https://media1.giphy.com/media/mD4Z5Iz80f0Kv1vC5H/giphy.gif">] -- * more _complexity_ -- * larger files -- * missing values -- * incorrect encodings --- ## Show something instead of showing everything .center[<img src="https://flowingdata.com/wp-content/uploads/2019/06/Running-cities.png" width=600>] .small[[Where people run](https://flowingdata.com/2014/02/05/where-people-run/)] --- ## Too much leads to overload .center[<img src="https://flowingdata.com/wp-content/uploads/2019/11/kpi-overload-750x537.jpg" width=600>] .small[[KPI Overload](https://marketoonist.com/2019/11/kpi-overload.html)] --- ## Start asking questions .center[#ASK -> ANSWER -> ASK NEW -> ANSWER AGAIN -> REPEAT] -- * What does the data look like? * Does anything stand out? * What are the mean and median?
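The first round of questions can be answered with a few lines of code before any chart is drawn. The sketch below uses only the Python standard library; the `summarize` helper and the sample values are hypothetical and serve only to illustrate the ask/answer loop.

```python
import statistics


def summarize(values):
    """Answer the first simple questions about a numeric column (hypothetical helper)."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Made-up sample with one extreme value.
sample = [3, 5, 5, 6, 7, 8, 120]
summary = summarize(sample)
print(summary)

# A large gap between mean and median hints that something stands out
# (here the single value 120), which suggests the next question to ask.
print("mean - median =", summary["mean"] - summary["median"])
```

In this made-up sample the mean (22) sits far above the median (6), which is a cue to look at the distribution before choosing an encoding.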
**Start simple and work your way up to more complex questions** --- class: inverse, center, middle # Break it down --- ## Highlighting specific parts of a dataset .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/Screen-Shot-2020-02-07-at-12.01.56-PM-750x375.png" width=600>] .small[[The Rich really pay lower taxes than you](https://www.nytimes.com/interactive/2019/10/06/opinion/income-tax-rate-wealthy.html)] --- ## Multiple charts .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/obesity-rates-750x647.png" width=500>] .small[[Mapping the spread of obesity](https://flowingdata.com/2016/09/26/the-spread-of-obesity/)] --- ## Linked views (dynamic) .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/linked-views-fivethirtyeight.gif">] .small[[How unpopular is Donald Trump](https://projects.fivethirtyeight.com/trump-approval-ratings/)] --- class: inverse, center, middle # Build it up --- ## Follow the data .center[<img src="https://flowingdata.com/wp-content/uploads/2020/03/follower-factor-animation.gif" width=500>] .small[[The Follower Factory](https://www.nytimes.com/interactive/2018/01/27/technology/social-media-bots.html)]